
Publication number: 0 610 774 A1

EUROPEAN PATENT APPLICATION

Application number: 94101440.9
Int. Cl.: [illegible]
Date of filing: 01.02.94
Priority: 09.02.93 US 15709
Date of publication of application: 17.08.94 Bulletin 94/33
Designated Contracting States: DE ES FR GB IT

Applicant: BECTON, DICKINSON & COMPANY
One Becton Drive
Franklin Lakes New Jersey 07417-1880 (US)

Inventor: Verwer, Ben J.H.
1034 Narciso Court
San Jose, California 95129 (US)

Inventor: Terstappen, Leon W.M.M.
1048 Colorado Place
Palo Alto, California 94313 (US)

Representative: Gerbino, Angelo et al
C/O JACOBACCI-CASETTA & PERANI S.p.A.
Via Alfieri 17
I-10121 Torino (IT)

Automatic lineage assignment of acute leukemias by flow cytometry.



A method for automatic lineage assignment of acute leukemias. Eight four-parameter list mode data files are
acquired with a flow cytometer in the following sequence: 1. unstained; 2. isotype controls; 3. CD10 FITC, CD19
PE; 4. CD20 FITC, CD5 PE; 5. CD3 FITC, CD22 PE; 6. CD7 FITC, CD33 PE; 7. HLADR FITC, CD13 PE and 8.
CD34 FITC, CD38 PE. First, data files 3 - 8 are clustered employing an algorithm based on nearest neighbors.
The clusters are then associated across the data files to form cell populations, using the assumption of light
scatter invariance across tubes for each population. The mean positions of each cell population are compared to
a decision tree which identifies normal cell populations. To identify leukemic cell populations, the algorithm
eliminates normal cell populations from the data space and the remaining populations are classified as B-lineage
ALL, T-lineage ALL, AML, AUL, B-CLL or unknown.

FIELD OF THE INVENTION

The present invention relates to immunophenotyping of normal and abnormal blood cell populations by
flow cytometry.


BACKGROUND OF THE INVENTION 

Acute leukemias are a heterogeneous group of diseases arising from the clonal expansion of malignant
hematopoietic progenitor cells. The heterogeneity of the disease is evidenced by the large diversity of
antigenic and light scatter profiles of leukemic cells in patients diagnosed with acute leukemia. This
heterogeneity and a poor correlation with normal cell differentiation lead to a lack of consensus in the panel
of reagents employed for classification and a lack of uniform criteria for lineage assignment. However, the
antigen profiles in acute leukemias are of clinical importance as the various subgroups identified have been
associated with different prognoses and serve as a guide for different treatment protocols.

Immunophenotyping by flow cytometry has significantly reduced inter-observer variations in the
subclassification of leukemias and has been shown to be particularly powerful in discriminating between
myeloid, B-lymphoid and T-lymphoid leukemias. However, traditional flow immunophenotyping may produce
biased results due to heterogeneity in leukemias. At the present time there is a lack of consensus in
the panel of reagents employed for classification and a lack of uniform criteria for lineage assignment.

Traditional flow immunophenotyping is based on finding an optimal light scatter gate followed by
application of marker settings on the immunofluorescence parameters. The distribution of the cells in a
display of forward and orthogonal light scatter varies considerably between leukemias, however, and does
not fit the normal lymphocyte, blast, monocyte and granulocyte light scatter regions. In addition to
difficulties in assessing the appropriate light scatter gate, complications arise when attempting to
define "negative" versus "positive" immunofluorescence staining in immunophenotyping of leukemias.

In multidimensional flow cytometric analysis the bias which is introduced by employing gates on light
scatter parameters is eliminated because all parameters are analyzed simultaneously. Cluster algorithms
(Salzman, G.C., et al. 1991. Cytometry Suppl. 5:64), principal components analysis (Leary, J.F., et al. 1988.
Cytometry Suppl. 2:99), neural nets (Frankel, D.S., et al. 1989. Cytometry 10:540) and PAINT-A-GATE
analysis (U.S. Patent No. 4,845,653) are among the approaches used for multidimensional analysis. These
algorithms permit a more precise identification of cell populations in the multidimensional data space. All
require listmode data files in which identical reagents are used. The number of reagents needed for most
clinical applications, however, far exceeds the number of available fluorochromes and therefore requires the
use of multiple reagent combinations, i.e., running a multi-tube panel with two to three reagents at a time.

The necessity for a large panel of monoclonal antibodies to achieve an optimal lineage assignment of
acute leukemias forces the investigator to stain multiple samples using either one, two or three color
immunofluorescence. The presence of multiple normal and leukemic cell populations in bone marrow or
peripheral blood from patients with leukemia results in a variable number of identifiable cell populations in
the samples stained with different antibodies. It is therefore difficult for the investigator to employ objective
criteria to assess the antigenic profile of the leukemia. Although the optimal solution to the problem is to
determine the antigenic profile in one tube stained with all the required monoclonal antibodies, at the
present time not enough different fluorochromes are available.

The present invention employs a novel data analysis method which associates cell populations across
tubes and links the positional information of these cell populations to a decision table for classification as
normal cells (monocytes, neutrophils, eosinophils, basophils, NK cells, T-lymphocytes and B-lymphocytes)
or as leukemic cell populations typical of B-lineage ALL, T-lineage ALL, AML, AUL and B-CLL. This
approach to data analysis can be generalized to any combination of flow experiments which require data
analysis across multiple tubes. The instant use for assigning lineages to acute leukemias is provided by
way of example.


SUMMARY OF THE INVENTION 

The present invention provides a novel data analysis technique which overcomes the difficulties
associated with the analysis and interpretation of data generated by analysis of multiple aliquots of a
sample, e.g., the lineage assignment of acute leukemias. The data analysis technique is based on two
concepts: 1. Identification of cell clusters, consisting of cells which have similar characteristics within one
sample and 2. Identification of cell populations, consisting of cells which exhibit similar characteristics over
all samples. In one embodiment, paired combinations of monoclonal antibodies (CD10/CD19, CD20/CD5,
CD3/CD22, CD7/CD33, HLA-DR/CD13 and CD34/CD38) conjugated to fluorescent labels are used for
immunophenotyping of acute leukemias by fluorescence staining. Eight data files for fluorescence and light
scatter are collected on a flow cytometer for each blood or bone marrow sample: an unstained sample, a
sample stained with appropriate isotype control antibody-fluorochrome conjugates and samples stained with
the six labeled antibody combinations.

The method first clusters the data files utilizing a clustering algorithm. A coordinate system is used to
determine the position of each cell cluster in the correlation of forward and orthogonal light scatter. The
immunofluorescence intensity of each cell cluster is determined by comparing the background staining of
cells with a common parameter in the unstained and isotype control sample. The clusters are then linked
across the data files to form cell populations, using the parameter profiles which are common across the
data files. The location of each of the cell populations in the now fourteen dimensional feature space is
compared with a decision table to make the lineage assignment. Residual erythrocytes, cell debris, normal
T lymphocytes, B lymphocytes, NK cells, neutrophils, eosinophils, basophils and monocytes are each
expected in a specific region in the fourteen dimensional feature space. By adding boundaries to their
frequency, the normal cell populations can be identified in leukemic bone marrow or blood samples. The
positions in the fourteen dimensional feature space of the cell populations which do not fulfill the normal
criteria are fed to the decision table which outputs their assignment as B-lineage ALL, T-lineage ALL, AML,
AUL, B-CLL or as a population of cells of unknown identity.

This data analysis technique employs a new concept for the analysis of flow data in that positional
information of cell clusters is matched across multiple aliquots of a sample. It provides the advantage of
more rapid analysis than is possible using conventional immunophenotyping techniques.

DESCRIPTION OF THE DRAWINGS 

Fig. 1 shows the result of clustering of tubes 3, 4 and 5 of table 1 for a normal sample. Clustering is
nonparametric and does not use a priori information about cells. At this stage the clusters have not yet been
identified.

Fig. 2 shows the coordinate system for light scatter profiles, i.e., the bivariate histograms of forward and
orthogonal light scatter. The orthogonal light scatter is transformed using a third-order polynomial to
increase the separation between the cell clusters. The 15*15 grid represents the internal resolution used by
the algorithms.

Fig. 3 shows the coordinate system for the immunofluorescence identifiers of the cell clusters. The
identifiers are composed of two characters. The first signifies the staining of the cells with the FITC labeled
antibody. The second signifies the staining with the PE labeled antibody. One identifier means that the
cells did not stain, a second means that the cells stained partially, and a "+" identifier means that the
cells stained fully.

Fig. 4 shows the clustering of the listmode data files of a patient with a B-lymphoid acute leukemia.
Colors are assigned in order of cluster size and cannot be used to link clusters from one panel to another.
Fig. 5 shows the cell populations identified by matching the cell clusters illustrated in Fig. 4.
Fig. 6 shows clustering of the listmode data files of a patient with a T-lymphoid acute leukemia.
Fig. 7 shows the cell populations identified by matching the cell clusters illustrated in Fig. 6.
Fig. 8 shows clustering of the listmode data files of a patient with an acute myeloid leukemia.
Fig. 9 shows the cell populations identified by matching the cell clusters illustrated in Fig. 8.

DETAILED DESCRIPTION OF THE INVENTION

In the inventive data analysis methodology a distinction is made between cell clusters, consisting of
cells which have similar characteristics within one tube, and cell populations, consisting of cells which
exhibit similar characteristics over all tubes. Cell populations may stain differently in different tubes, but the
cells can be associated on the basis of one or more features which remain the same across all tubes, e.g.,
light scatter properties, the number of cells or a common fluorescence parameter. For example, a possible
cell population is identified when the intersection of the scatter profiles of six clusters, one per tube, is not
empty. The intersection in this case is calculated on the basis of the light scatter profiles of the cell clusters.
Using light scatter as the common parameter across tubes is preferred for its simplicity and because light
scatter tends to be less variable than other parameters.

The distinction between cell clusters and cell populations is made to compensate for the lack of a
sufficient number of fluorescent colors. For this reason, cell populations must be inferred from multiple data
files. For example, if there is a need to measure four colors for a sample with two cell populations and if the
flow cytometer only allows simultaneous measurement of two fluorescence parameters, the experiment
must be split into two flow cytometer runs. In this case, the cell populations may show different
characteristics in one tube (two clusters) but the same characteristics in the other (one cluster). In practice,
the number of tubes necessary depends on how many fluorescence labels can be measured simultaneously
by the instrument. The method of analysis disclosed herein is not limited to two fluorescence colors
per tube, and when flow cytometers are used which allow detection of three or more fluorescence
parameters additional fluorescence labels may be included in the procedures.

In certain circumstances cell populations may not be deducible from the data. For example, if two
clusters are found in a first tube (1a and 1b), two clusters in a second tube (2a and 2b) and the clusters
have the same scatter profile, it cannot be deduced whether 1a and 2a or 1a and 2b are from the same cell
population. Even clusters of different sizes do not resolve this issue as cells which form a cluster in one
tube may not necessarily do so in another tube. For example, if the number of cells in 1a>1b and in 2a>2b,
there could be 3 cell populations (X, Y and Z). That is, X and Y may combine in tube 1 to form cluster 1a,
with Z forming 1b, and in tube 2 Y and Z combine to form 2a, with X forming 2b.

The solution to this problem is provided by the application domain which assumes that the cells are
part of a normal population which exhibits expected normal staining properties. If this hypothesis cannot be
falsified, the cells are assumed to be normal and classified as such. Normal cells are then removed from
subsequent analysis for identification of abnormal cells. No assumptions can be made about the remaining
clusters, and all cell populations are listed. Analysis stops after the possible populations have been
identified. In most cases each possible cell population is a real cell population. However, if multiple
populations have sufficiently similar light scatter profiles, the inventive algorithm cannot distinguish them.

Identification of Cell Clusters In a Listmode Data File 

Clustering according to the invention may use any of the clustering algorithms known in the art. These
include, for example, the isodata (G. H. Ball and D. J. Hall. 1966. Intl. Nat'l. Commun. Conf., Philadelphia)
and K-means algorithms. These and other useful clustering algorithms are described in M. R. Anderberg,
Cluster Analysis for Applications, Academic Press, New York/London, 1973; P. H. A. Sneath and R. R.
Sokal, Numerical Taxonomy, Freeman Publishers, San Francisco, 1973; and J. A. Hartigan, Clustering
Algorithms, John Wiley Publishers, New York, 1975. In a preferred embodiment, the clustering algorithm is
a modified algorithm based on the mutual nearest neighbor value (MNN) (Chidananda, G.K., et al. 1978.
Pattern Recognition 10:105). The MNN of two cells is the sum of the ranks of the cells in their respective
nearest neighbor lists. Two cells are assigned to the same cluster if the MNN is smaller than a preselected
threshold T. In the unmodified MNN algorithm the clusters would be determined based on this data alone.
However, for flow data it is preferred to use a modified MNN algorithm.
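The unmodified MNN rule described above can be illustrated with a short sketch. It is not the code of the invention: the function names, the brute-force neighbor search and the use of Euclidean distance are assumptions made only for illustration.

```python
from math import dist


def neighbor_ranks(points, k):
    """For each point, map neighbor index -> rank (1 = nearest) over its k nearest neighbors."""
    ranks = []
    for i, p in enumerate(points):
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: dist(p, points[j]))
        ranks.append({j: r for r, j in enumerate(order[:k], start=1)})
    return ranks


def mnn_value(ranks, i, j):
    """Sum of the mutual ranks, or None if either cell is absent from the other's list."""
    if j not in ranks[i] or i not in ranks[j]:
        return None
    return ranks[i][j] + ranks[j][i]


def mnn_clusters(points, k=5, threshold=7):
    """Link every pair of cells whose MNN value is below the threshold T (assumed union-find)."""
    ranks = neighbor_ranks(points, k)
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(len(points)):
        for j in ranks[i]:
            value = mnn_value(ranks, i, j)
            if value is not None and value < threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(points))]
```

Cells that share a returned label belong to the same cluster. The brute-force search is quadratic in the number of cells and is only suitable for small illustrations.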

The modification to the MNN algorithm assists in compensating for noise in flow data which can cause
errors in clustering. After finding a preselected number of nearest neighbors of each cell (k) (Kim, B.S., et
al. 1986. IEEE Transactions on Pattern Analysis and Machine Intelligence 8:761), the distance between the
cell and each of these neighbors is calculated. The procedure is repeated for all cells in the data file. After
sorting the list in order of distance, cells are merged in order of increasing distance to form clusters. Two
cells (and the clusters they belong to) are not merged if their clusters exceed a critical size S and the
distance between the cells is substantially larger (by the separation factor F) than the average distance
between cells in each of the clusters. Optionally, a cleanup can be performed after merging in which
remaining cells are assigned to clusters close to them. This last step is not usually required for diagnostic
applications but is preferred for applications where absolute cell counts are required.
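The merging step just described can be sketched as follows. This is a minimal sketch under stated assumptions, not the inventors' implementation: the "average distance between cells" of a cluster is approximated here by the mean length of the neighbor links already merged into it, S is expressed as a fraction of all cells, and the names are hypothetical. The optional cleanup step is omitted.

```python
from math import dist


def modified_mnn_merge(points, k=5, size_fraction=0.01, separation=2.0):
    """Merge cells along k-nearest-neighbor links in order of increasing distance."""
    n = len(points)
    critical = size_fraction * n          # critical cluster size S, in cells
    edges = set()
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: dist(points[i], points[j]))
        for j in order[:k]:
            edges.add((dist(points[i], points[j]), min(i, j), max(i, j)))

    parent = list(range(n))
    size = [1] * n
    link_sum = [0.0] * n                  # total length of links merged so far
    link_count = [0] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for d, i, j in sorted(edges):
        a, b = find(i), find(j)
        if a == b:
            continue
        both_large = size[a] > critical and size[b] > critical
        if both_large and link_count[a] and link_count[b]:
            avg_a = link_sum[a] / link_count[a]
            avg_b = link_sum[b] / link_count[b]
            # Assumed reading of F: block the merge when the link is more than
            # F times the average link length of each of the two clusters.
            if d > separation * avg_a and d > separation * avg_b:
                continue
        parent[b] = a
        size[a] += size[b]
        link_sum[a] += link_sum[b] + d
        link_count[a] += link_count[b] + 1
    return [find(i) for i in range(n)]
```

With a small separation factor, two dense groups joined only by long links stay apart; with a very large factor the same groups are merged, which is consistent with the statement that a smaller F yields more clusters.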

The parameters which can be varied by the user in this clustering algorithm to optimize results for a
particular application are k = number of neighbors, T = threshold at which two cells are considered
neighbors, S = size of the cluster and F = separation factor. The sensitivity of the algorithm to the
parameters k and T is low. In general, any value of k between 4 and 6 and T between k and 2*k will give
similar results. For analysis of the leukemic data files described herein, k = 5, T = 7, S = 1% and F = 1.5-2.
These values were obtained by adjustment of the parameters until clusters were found which could be
perceived as clusters.

The separation factor F has more weight in the analysis. Different separation factors F may be applied
in different sample tubes, depending on the antibody characteristics of the tubes. In the CD7/CD33 and
CD34/CD38 tubes of the examples below F was set at 1.5. In the other tubes F was 2.0. This resulted in
more clusters identified in the CD7/CD33 and CD34/CD38 tubes and was necessary because of reduced
discrimination between cell clusters in those tubes. That is, F is smaller in the tube with CD7-FITC because
the separation between T lymphocytes and NK cells expressing CD7 and B lymphocytes with a similar light
scatter profile but not expressing CD7 is less than for T lymphocytes identified with CD3.

Coordinate System for Cell Cluster Location

Positional information for the cell clusters is used to establish their identity. To optimize the
distribution of cell clusters in the light scatter display the orthogonal light scatter parameter is preferably
transformed according to the polynomial function described by L.W.M.M. Terstappen, et al. (1990.
Cytometry 11:506). The light scatter profile is then defined as a 2-dimensional histogram quantized in 15*15
resolution. To eliminate inter-experiment variability, each experiment is paired with an analysis of normal
bone marrow or normal peripheral blood. The mean position of the normal lymphocytes is used to shift the
scatter data of the leukemic cells to a position such that the mean of normal T cells falls at absolute
channel numbers 110 for FSC and 25 for SSC on a scale of 0-255.
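The shift and the 15*15 quantization can be sketched as below. The function names are hypothetical, the polynomial transformation of the orthogonal scatter is not reproduced (its coefficients are not given here), and clamping shifted values to the 0-255 range is an assumption.

```python
def shift_scatter(cells, normal_mean, target=(110.0, 25.0)):
    """Shift (FSC, SSC) pairs so that the normal reference mean lands on the target channels."""
    dx = target[0] - normal_mean[0]
    dy = target[1] - normal_mean[1]
    return [(min(255.0, max(0.0, fsc + dx)), min(255.0, max(0.0, ssc + dy)))
            for fsc, ssc in cells]


def scatter_profile(cells, bins=15, channels=256):
    """Quantize (FSC, SSC) pairs into a bins*bins histogram indexed as [ssc_bin][fsc_bin]."""
    hist = [[0] * bins for _ in range(bins)]
    for fsc, ssc in cells:
        row = min(bins - 1, int(ssc * bins / channels))
        col = min(bins - 1, int(fsc * bins / channels))
        hist[row][col] += 1
    return hist
```

For instance, with a normal reference mean of (100, 30) a cell measured at (100, 30) is moved to (110, 25) and counted in the bin at row 1, column 6.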

To specify expected scatter profiles of normal cells, a coarser coordinate system may be established
(5*5). Fig. 2 illustrates the regions in which normal erythrocytes, lymphocytes, stem cells, basophils,
monocytes, neutrophils and eosinophils are located. A cluster will only be classified as one of these normal
cell populations when its mean light scatter value is located within the defined region. The position of the
light scatter regions for assignment of leukemia cell clusters is also indicated in Fig. 2.

The position of cell clusters in the correlation of two immunofluorescence parameters is assigned one of
nine fluorescence identifiers as illustrated in Fig. 3. The assignment is dependent on the position of the
cells with the same scatter profile in an isotype control and is based on analyses of the 1-D
immunofluorescence histograms of the cells in one cluster. The median of the background staining for both
FITC and PE is determined in the isotype control and then compared to the median of the cell cluster in the
stained samples. The cluster is considered to express the antigen totally when the ratio between the median
of the stained and the isotype sample is larger than two. The cell cluster is considered to express the
antigen partially when more than 20% of the cells are positive. A cell is considered positive when its
fluorescence intensity exceeds an estimated 0.99 percentile. This 0.99 percentile is defined as the median
value plus twice the difference between the 0.87 percentile and the median as it would be for a log normal
distribution. This approach to determine whether or not a cell is positive is less sensitive to noise than a
direct determination of the 0.99 percentile. When the criteria for positive staining are not fulfilled, the cell
cluster did not shift significantly and is considered not to express the antigen defined by the fluorochrome-
labeled antibody.
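The staining rules above can be expressed as a small sketch for one fluorescence parameter. Several points are assumptions rather than statements of the source: the estimated 0.99 percentile is taken from the isotype control, a simple nearest-rank quantile is used, both rules are applied to the same intensity values, and the labels "total", "partial" and "none" stand in for the identifiers of Fig. 3.

```python
from statistics import median


def quantile(values, q):
    """Nearest-rank quantile on the sorted values (assumed estimator)."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


def positive_threshold(control):
    """Estimated 0.99 percentile: median + 2 * (0.87 percentile - median)."""
    m = median(control)
    return m + 2 * (quantile(control, 0.87) - m)


def staining(control, stained):
    """Classify one cluster for one fluorescence parameter."""
    if median(stained) > 2 * median(control):      # median ratio larger than two
        return "total"
    cut = positive_threshold(control)
    fraction = sum(v > cut for v in stained) / len(stained)
    return "partial" if fraction > 0.20 else "none"
```

The factor of two is close to what a normal distribution would give: the 0.87 quantile lies about 1.13 standard deviations above the median and the 0.99 quantile about 2.33, so doubling the former distance approximates the latter.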

Identification of Normal Cell Populations in Multiple Listmode Data Files

The final assessment for cell populations is based on parts of the cell clusters. Those parts become
distinguishable in a combinatorial process in which all clusters are tested against each other. In this process
all possible pairwise combinations of cell clusters (one per tube) are considered, e.g., cluster 1 of tube 3
with cluster 1 of each of tubes 4 - 8, cluster 2 of tube 3 with cluster 1 of each of tubes 4 - 8, etc. The
combination process sets the minimum value for all bins in the 15*15 scatter profiles of the clusters in the
current combination, implemented as a tree structure. That is, bin 1 in the newly constructed 15*15 scatter
profiles of the population is the minimum value of bin 1 of the 15*15 scatter profiles of the six cell clusters.
The other bins are similarly set. Preferably, the 15*15 scatter profiles of the cell clusters are smoothed with
a 3*3 uniform filter to compensate for statistical fluctuations in the scatter data of one cell population over
the six tubes. If the resulting scatter profile is not empty (>1% of the cells) a possible cell population is
identified.
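The bin-wise minimum over one cluster per tube can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the profiles are plain nested lists, bins outside the grid are treated as empty when smoothing, and a flat enumeration of combinations is used in place of the tree structure mentioned above.

```python
from itertools import product


def smooth3x3(profile):
    """Apply a 3*3 uniform filter; bins outside the grid count as zero (assumption)."""
    n = len(profile)
    out = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n and 0 <= cc < n:
                        total += profile[rr][cc]
            out[r][c] = total / 9.0
    return out


def intersect_profiles(profiles):
    """Bin-wise minimum of the smoothed scatter profiles, one profile per tube."""
    smoothed = [smooth3x3(p) for p in profiles]
    n = len(smoothed[0])
    return [[min(p[r][c] for p in smoothed) for c in range(n)] for r in range(n)]


def is_possible_population(profiles, total_cells, min_fraction=0.01):
    """A possible population exists when the overlap holds more than 1% of the cells."""
    overlap = intersect_profiles(profiles)
    return sum(map(sum, overlap)) > min_fraction * total_cells


def candidate_populations(tubes, total_cells, min_fraction=0.01):
    """Try every combination of one cluster per tube; return the index tuples that overlap."""
    found = []
    for combo in product(*(range(len(t)) for t in tubes)):
        profiles = [tubes[t][i] for t, i in enumerate(combo)]
        if is_possible_population(profiles, total_cells, min_fraction):
            found.append(combo)
    return found
```

Two clusters whose cells fall in adjacent bins have an empty overlap before smoothing and a non-empty one after it, which is the statistical fluctuation the 3*3 filter is meant to absorb.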

The properties of the possible cell population are then determined: 1. the number of cells, 2. the area of
the scatter profile (the number of bins in the 15*15 histogram which have cells in them), 3. the mean
fluorescence values of the unstained and isotype controls, and 4. the immunological profile of the cells. The
immunological profile of a cell population is defined by a set of fluorescence identifiers. Each cluster
belonging to the population (maximally one per tube) receives a fluorescence identifier. The identifier is
based on the fluorescence intensity of those cells of the clusters which fall within the scatter profile of the
cell population. In more conventional terminology, the scatter profile of the population defines a gate for the
cells of each of the cell clusters.
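The use of the population scatter profile as a gate can be sketched as below. The names and the cell representation (FSC, SSC, FL1, FL2) are assumptions for illustration; the binning matches a 15*15 grid over 256 channels.

```python
def scatter_bin(fsc, ssc, bins=15, channels=256):
    """Return the (row, column) bin of a cell in the bins*bins scatter grid."""
    return (min(bins - 1, int(ssc * bins / channels)),
            min(bins - 1, int(fsc * bins / channels)))


def profile_area(profile):
    """Number of bins in the scatter profile which have cells in them."""
    return sum(1 for row in profile for value in row if value > 0)


def gate_cluster(cells, population_profile):
    """Keep the cells of a cluster whose scatter bin is occupied in the population profile."""
    bins = len(population_profile)
    gated = []
    for fsc, ssc, fl1, fl2 in cells:
        row, col = scatter_bin(fsc, ssc, bins)
        if population_profile[row][col] > 0:
            gated.append((fsc, ssc, fl1, fl2))
    return gated
```

The fluorescence identifier of a cluster within a population would then be computed from the gated cells only, which is why it can differ from the identifier first assigned to the whole cluster.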

Table 1 shows the criteria for the properties of normal cell populations, determined using a test set of
normal data files. When a normal cell population is identified, the scatter profiles of the clusters contributing
to that cell population are updated by subtracting the scatter profile of the normal population from each of
the scatter profiles of the six clusters.

Identification of Abnormal Cell Populations in Multiple Listmode Data Files

Once all cells belonging to the normal populations are removed, the characteristics of all possible
remaining cell populations are checked in combination against a table of diagnoses (Table 2). In a first
stage of analysis, this table preferably takes into account all tubes, although the antibodies in some tubes
might not be relevant to the diagnosis. This approach provides more complete information to the user as
each population is identified by its clusters in each tube. In a second stage of analysis, only tubes which
have an entry on each of the lines in table 3 are checked. This approach maximizes diagnostic
effectiveness. This second stage is preferred in cases where a small cluster is obscured by a larger
population of normal cells. Matching a small cluster to the appropriate cell population may be impossible
because the normal cells, the number of which can differ statistically in the various tubes, are removed.
Therefore, when there is evidence for an anomaly, it is preferably reported.
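Before the leukemic criteria are applied, each candidate population is screened against the normal-cell criteria of Table 1. A partial sketch of that screening is given below. It uses only three rows and only the scatter region, autofluorescence and maximum-percentage columns; the immunological profile columns are not modeled, so the function returns candidates rather than a final assignment. The function names and the dictionary layout are hypothetical.

```python
# Subset of Table 1: cell type -> (scatter region, autofluorescence rule, max % of cells).
NORMAL_CRITERIA = {
    "T-cells": ("S2", "low", 80),
    "Neutrophils": ("S6", "any", 80),
    "Eosinophils": ("S7", "high", 10),
}


def autofluorescence_ok(rule, mean_unstained_channel):
    """'high' requires a mean unstained channel above 64, 'low' a value below 64."""
    if rule == "high":
        return mean_unstained_channel > 64
    if rule == "low":
        return mean_unstained_channel < 64
    return True  # 'any'


def matching_normal_types(region, mean_unstained_channel, percent_cells):
    """Return the normal cell types whose non-immunological criteria are satisfied."""
    return [name
            for name, (reg, rule, max_pct) in NORMAL_CRITERIA.items()
            if reg == region
            and autofluorescence_ok(rule, mean_unstained_channel)
            and percent_cells <= max_pct]
```

A population in region S2 with low autofluorescence and a frequency of 4.3% passes this partial screen as a T-cell candidate; a population in the same region at 90% does not.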

EXAMPLE 1 


AUTOMATED LINEAGE ASSIGNMENT OF ACUTE LEUKEMIAS 

Mononuclear cells from bone marrow aspirates of B-lineage ALL, T-lineage ALL and AML patients were
separated on FICOLL-HYPAQUE (Sigma Chemical Co., St. Louis, MO) and immunofluorescently labeled
following the protocol of the Acute Leukemia Phenotyping Kit (Becton Dickinson Immunocytometry Systems
(BDIS), San Jose, California). The antibody combinations used were as shown in Table 3:

TABLE 3

Antibody Combinations

          FITC Labeled       PE Labeled
Tube 1    Unstained          Unstained
Tube 2    Isotype (IgG2a)    Isotype (IgG1)
Tube 3    CD10 (CALLA)       CD19 (Leu12)
Tube 4    CD20 (Leu16)       CD5 (Leu1)
Tube 5    CD3 (Leu4)         CD22 (Leu14)
Tube 6    CD7 (Leu9)         CD33 (LeuM9)
Tube 7    HLADR              CD13 (LeuM7)
Tube 8    CD34 (HPCA-2)      CD38 (Leu17)


Flow cytometric analysis was performed on a FACSCAN (BDIS). The instrument was prepared for
sample analysis using CALIBRITE Beads and AUTOCOMP software (BDIS). Data acquisition was performed
using LYSYS 2.0 Software (BDIS). Forward light scatter, orthogonal light scatter and the two log (4 decade)
amplified fluorescence signals were measured for 10000 cells and the data stored in listmode data files.
Data from 5000 cells were used for analysis to reduce processing time. Forward and orthogonal light scatter
detectors were adjusted using normal blood as a control during instrument setup. Lymphocytes were found
between channels 50 and 150 for FSC and just above channel 0 for SSC. Normal sample data was saved
for later calibration of sample scatter data. The data analysis algorithms were developed using C++ on
SUN Sparcstations and Macintosh PCs.

Fig. 4 shows the clustering result of 5000 cells of a patient with an acute B-lymphoid leukemia. In each
of the fluorescence displays the cell clusters found were assigned a color in order of decreasing percentage
per tube (plotting colors in one tube have no relationship to plotting colors in other tubes). Scatter positions
and immunofluorescence identifiers for the clusters are shown in the upper right corner of each of the plots.
The clusters are shown in immunofluorescence dotplots but were identified in four-dimensional space.

For example, in Fig. 4B (tube 3) five clusters were found: 1. a cluster plotted in red with a frequency of
70.6%, located in a light scatter region C2, staining with CD19 but not with CD10; 2. a cluster plotted in
green with a frequency of 11.1%, located in a light scatter region E3, not staining with CD10 and CD19; 3. a
cluster plotted in blue present in a frequency of 8.5%, located in a light scatter region C2, not staining with
CD10 and CD19; 4. a cluster plotted in purple present in a frequency of 5.1%, located in a light scatter
region E3, staining with CD19 but not with CD10 and 5. a cluster plotted in dark blue with a frequency of
2.2% and located in a light scatter region E5 and not staining with CD10 and CD19. Immunofluorescence
identifiers are preliminary and were not used to identify cell populations.

After the data files were clustered, the algorithm was used to search for normal cell populations and
eliminate them from the analysis by subtracting the population scatter histogram from the cluster scatter
histogram. The algorithm then searched for the presence of abnormal cell populations.
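The subtraction of a population scatter histogram from a cluster scatter histogram can be sketched in a few lines. The function names are hypothetical, and clamping negative bins to zero is an assumption; the text does not say how bins are handled when the population value exceeds the cluster value.

```python
def subtract_population(cluster_profile, population_profile):
    """Remove a population's scatter histogram from a cluster's, bin by bin."""
    return [[max(0, c - p) for c, p in zip(cluster_row, population_row)]
            for cluster_row, population_row in zip(cluster_profile, population_profile)]


def remaining_cells(profile):
    """Total count left in a scatter histogram after subtraction."""
    return sum(map(sum, profile))
```

Applied to each of the six stained tubes in turn, this leaves only the part of each cluster that has not yet been attributed to a normal population.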

In Fig. 5, the result of the population search is illustrated with the colors now matched across tubes.
The cell populations found were normal T cells (4.3%) plotted in green, normal monocytes plotted in blue
(6.0%) and a cell population identified as B-lineage ALL plotted in red (59.0%). The percentages quoted for
the populations are based on the overlap of the clusters in the scatter space and therefore represent a
lower boundary. For example, although the frequency of the predominant cell cluster was greater than
66.2% in all of the samples, the frequency of the leukemic cell population which is composed of portions of
the various cell clusters is only 59.0%. The initial identifiers assigned to the clusters may differ from the
identifiers assigned to the clusters of the cell populations because the cells of each cluster are gated with
the scatter histogram of the population.

The data files of a patient with acute T-lymphoid leukemia were similarly clustered (Fig. 6). In Fig. 6B
(tube 3) three clusters were found, in Fig. 6C (tube 4) three, Fig. 6D (tube 5) six, Fig. 6E (tube 6) five, Fig.
6F (tube 7) four and in Fig. 6G (tube 8) five. Correlating the positional information across tubes indicated the
presence of a population of cells which fulfilled the criteria of normal T lymphocytes, plotted in green in Fig.
7. A second population of cells was found and classified as T-lineage ALL, plotted in red in Fig. 7. In this
case, the frequency and position of the cluster in the sixth tube (CD34/CD38) discriminated between normal
T-cells and malignant T-cells.

The clustering of the data files of a patient with acute myeloid leukemia is shown in Fig. 8. In Rg. 8B 
(tube 3) four clusters were found, in Fig. 8C (tube 4) two. Fig. 8D (tube 5) four. Fig. 8E (tube 6) four. Rg. 8F 
(tube 7) four and in Rg. 8G (tube 8) five. By correlating the positional information across tubes three cell 
populations were found, as shown in Rg. 9. The cell population plotted in green was classified as a normal 

25 T-cell population, the population plotted in dark blue contained monocytes and the cell population plotted in 
red was classified as AML. In this experiment, cell population classified as monocytes consisted of two 
populations which differed slightly in their locations. Additionally, two cell populations identified as AML 
were identified which only differed in the sixth tube. For a clinical report, these populations could be added 
together. Reporting all populations found, however, more clearly illustrates the algorithm used to find the 

30 cell populations. In this case, the cells classified as monocytes most likely belong to the leukemia. However, 
the criteria to classify these cell populations as leukemic were not met. 
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Table 1 Criteria for normal cells 



                                               immunological profile 3

cell type        scat.     auto-      max %    CD10   CD20   CD3    CD7    HLADR  CD34
                 prof. 1   fluor. 2   cells    CD19   CD5    CD22   CD33   CD13   CD38

Eryth./debris    S1        any        90
T-cells 4        S2        low        80
NK-cells 4       S2        low        20
B-cells 4        S2        low        20
Stem cells       S3        low        1
Basophils        S4        any        5
Monocytes 4      S5        any        10
Neutrophils      S6        any        80
Eosinophils      S7        high       10

[The immunological profile entries are region symbols as defined in figure 2 and are not reproduced here.]

1 Scatter profile as defined in figure 1.

2 High requires the mean unstained channel number to be larger than 64, low requires a value
less than 64.

3 Immunological profile as defined in figure 2.
The median of a cluster in a tube has to fall in one of the black colored regions,
e.g. [symbol] means that the cluster should be negative for both antibodies;
[symbol] means that the cluster should be positive for FL1 and that FL2 is irrelevant.

4 T-cells, NK-cells, B-cells and Monocytes cannot be scattered over more
than 25% of the total dotplot area.
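
The non-graphical criteria of Table 1 can be expressed as a simple lookup. The sketch below is a hedged illustration only: it checks the scatter profile, the autofluorescence class from footnote 2 and the maximum percentage of cells, and it leaves out the immunological profile test and the 25% dotplot area limit of footnote 4. The function names are hypothetical, and treating the maximum percentage as inclusive is an assumption.

```python
# (scatter profile, autofluorescence class, max % of cells), as read from Table 1.
NORMAL_CRITERIA = {
    "Eryth./debris": ("S1", "any", 90),
    "T-cells": ("S2", "low", 80),
    "NK-cells": ("S2", "low", 20),
    "B-cells": ("S2", "low", 20),
    "Stem cells": ("S3", "low", 1),
    "Basophils": ("S4", "any", 5),
    "Monocytes": ("S5", "any", 10),
    "Neutrophils": ("S6", "any", 80),
    "Eosinophils": ("S7", "high", 10),
}


def autofluorescence_class(mean_unstained_channel):
    """Footnote 2: 'high' above channel 64, 'low' below channel 64."""
    if mean_unstained_channel > 64:
        return "high"
    if mean_unstained_channel < 64:
        return "low"
    return None  # exactly 64 is not covered by the footnote


def candidate_normal_types(scatter_profile, mean_unstained_channel, percent_of_cells):
    """Return the normal cell types whose non-graphical criteria are satisfied."""
    observed = autofluorescence_class(mean_unstained_channel)
    matches = []
    for cell_type, (scatter, required, max_percent) in NORMAL_CRITERIA.items():
        if scatter != scatter_profile:
            continue
        if required != "any" and required != observed:
            continue
        if percent_of_cells > max_percent:  # assumption: the limit itself is allowed
            continue
        matches.append(cell_type)
    return matches
```

A population that passes this filter for several cell types (for example the three lymphocyte types sharing profile S2) would still have to be separated by its immunological profile.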

Table 2 Criteria for leukemic cells.



Leukemia   scat.   auto-    min %      immunological profile 2
lineage    prof.   fluor.   cells 1    CD10   CD20   CD3    CD7    HLADR  CD34
                                       CD19   CD5    CD22   CD33   CD13   CD38

B-CLL      S8      low      10
B-ALL      S8      low      5 or 10, depending on the profile

[The immunological profile entries are region symbols and are not reproduced here.]

1 The percentages are low if the requirements of the immunological profile are strict.

2 If no region for the cluster is specified the position of the cluster in that tube is not tested.

3 If CD34 positive, minimum % of leukemic cells goes down to 1%.
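
The size criterion of Table 2, together with footnote 3, can be sketched as follows. This is a minimal illustration with hypothetical function names; it assumes that the comparison is inclusive and that the CD34 rule never raises a minimum that is already below 1%.

```python
def min_percent_required(table_min_percent, cd34_positive):
    """Footnote 3: for CD34 positive populations the minimum drops to 1%."""
    if cd34_positive:
        return min(float(table_min_percent), 1.0)
    return float(table_min_percent)


def meets_size_criterion(percent_of_cells, table_min_percent, cd34_positive):
    """True if the population is large enough to be reported as leukemic."""
    required = min_percent_required(table_min_percent, cd34_positive)
    return percent_of_cells >= required  # assumption: the minimum itself passes
```

For example, a population holding 3% of the cells fails a 10% minimum unless it is CD34 positive.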

Table 2 (continued). Criteria for leukemic cells



Leukemia   scat.      auto-    min %      immunological profile
lineage    prof.      fluor.   cells      CD10   CD20   CD3    CD7    HLADR  CD34
                                          CD19   CD5    CD22   CD33   CD13   CD38

T-ALL      S8         low
AML        S9, S10
AUL        S2         low

[The min % cells values and the immunological profile symbols are not recoverable here.]


Claims 

1. A method for analyzing data generated by flow cytometric analysis of multiple aliquots of a sample
containing cells to be analyzed, wherein the cells in each aliquot are stained with at least two
monoclonal antibodies conjugated to fluorochromes which are distinguishable from each other by flow
cytometric analysis, the method comprising:

a) acquiring listmode data files for light scatter and fluorescence for each aliquot;

b) identifying cell clusters in each aliquot by cluster analysis of the data files for each aliquot; and

c) identifying cell populations in the sample by linking the cell clusters across the data files on the
basis of at least one common parameter.



2. The method according to Claim 1 further comprising the step of determining the lineage of the cell 
population by comparing the fluorescence and light scatter characteristics of the population to 
fluorescence and light scatter characteristics expected for a selected lineage. 

3. The method according to Claim 2 further comprising removing normal cell population data from the 
data files and determining the lineage of remaining abnormal cell populations. 



4. The method according to Claim 3 wherein the lineage of remaining leukemic cell populations is 
determined. 



5. A method for determining the lineage of acute leukemia cells in a sample by flow cytometric analysis
comprising:

a) staining the cells in each one of multiple aliquots of the sample with at least two monoclonal
antibodies conjugated to fluorochromes, each fluorochrome being distinguishable from the other
fluorochromes in the aliquot by flow cytometric analysis;

b) acquiring listmode data files for light scatter and fluorescence for each aliquot;

c) identifying cell clusters in each aliquot by nearest neighbor analysis of the data files for each
aliquot;

d) identifying cell populations in the sample by linking the cell clusters across the data files on the
basis of common light scatter properties;

e) identifying normal cell populations by comparing the fluorescence and light scatter characteristics
of the cell populations to light scatter and fluorescence characteristics expected for normal cells;

f) removing the data for the normal cell populations from the acquired data; and

g) determining the lineage of remaining abnormal cell populations by comparing the fluorescence
and light scatter characteristics of the abnormal populations to fluorescence and light scatter
characteristics of a selected leukemic cell lineage.

6. The method of Claim 5 wherein each of the aliquots is stained with an antibody conjugated to FITC and
an antibody conjugated to PE, the antibodies being specific for CD10, CD19, CD20, CD5, CD3, CD22,
CD7, CD33, HLADR, CD13, CD34 and CD38, and the fluorescence and light scatter characteristics of
the abnormal cell populations are compared to the fluorescence and light scatter characteristics of a
leukemic cell lineage selected from the group consisting of B-lineage ALL, T-lineage ALL, AML, AUL,
and B-CLL.
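
The clustering and linking steps recited in the claims (identifying clusters in each aliquot, then linking them across data files on the basis of light scatter) can be illustrated with a short sketch. It is an illustration under stated assumptions, not the method of the application: the clustering is a brute-force grouping of events that lie within a distance `radius` of each other, used as a stand-in for the nearest neighbor analysis, and the linking rule (closest mean light scatter within a `tolerance`) is likewise assumed. All function names and both parameters are hypothetical, and light scatter is assumed to occupy the first two of the four parameters of each event.

```python
import math


def cluster_events(events, radius):
    """Give events closer than `radius` the same cluster id (0, 1, ...)."""
    parent = list(range(len(events)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # shorten the chain while walking up
            i = parent[i]
        return i

    for i in range(len(events)):
        for j in range(i + 1, len(events)):
            if math.dist(events[i], events[j]) < radius:
                parent[find(j)] = find(i)

    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(events))]


def scatter_means(events, labels):
    """Mean (forward, side) light scatter of every cluster in one data file."""
    sums = [[0.0, 0.0, 0] for _ in range(max(labels) + 1)]
    for (fsc, ssc, *_), label in zip(events, labels):
        sums[label][0] += fsc
        sums[label][1] += ssc
        sums[label][2] += 1
    return [(sx / n, sy / n) for sx, sy, n in sums]


def link_clusters(means_per_file, tolerance):
    """Link each cluster of the first file to the closest cluster in every other file.

    Returns one list of cluster ids (one id per file) for each population found.
    A cluster without a partner within `tolerance` in some file is dropped.
    """
    populations = []
    for seed_id, seed in enumerate(means_per_file[0]):
        members = [seed_id]
        for other in means_per_file[1:]:
            best = min(range(len(other)), key=lambda k: math.dist(seed, other[k]))
            if math.dist(seed, other[best]) > tolerance:
                break
            members.append(best)
        else:
            populations.append(members)
    return populations


# Two illustrative data files; events are (FSC, SSC, FL1, FL2) with made-up values.
file_a = [(100, 50, 10, 10), (102, 51, 12, 9), (400, 300, 200, 15), (402, 298, 205, 14)]
file_b = [(401, 301, 20, 180), (399, 299, 22, 182), (101, 49, 11, 150), (99, 52, 10, 152)]

labels_a = cluster_events(file_a, radius=20)
labels_b = cluster_events(file_b, radius=20)
means = [scatter_means(file_a, labels_a), scatter_means(file_b, labels_b)]
populations = link_clusters(means, tolerance=10)
```

In this toy input the low-scatter cluster is cluster 0 in the first file and cluster 1 in the second, and the linking step pairs them despite their different fluorescence. The pairwise loop is quadratic in the number of events and is only suitable for small examples.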